A Corpus Balancing Method for Language Model Construction
Abstract
The language model is an important component of any speech recognition system. In this paper, we present a lexical enrichment methodology for corpora, aimed at the construction of statistical language models. The methodology has two steps: first, identifying the set of poorly represented words in a given training corpus; second, enriching that corpus by repeatedly including selected text fragments that contain these words. The first part of the paper describes the formal details of the methodology; the second part presents experiments and results that validate the method.
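The two steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract gives no formal criteria, so the count threshold `min_count`, the whitespace tokenization, the use of whole sentences as "text fragments", and the function names `find_rare_words` and `enrich_corpus` are all assumptions made for illustration.

```python
from collections import Counter


def word_counts(sentences):
    """Count whitespace-separated tokens over a list of sentences."""
    return Counter(word for sentence in sentences for word in sentence.split())


def find_rare_words(sentences, min_count):
    """Return words seen fewer than min_count times (assumed rarity criterion)."""
    return {w for w, c in word_counts(sentences).items() if c < min_count}


def enrich_corpus(sentences, min_count, max_rounds=10):
    """Re-append original sentences that contain rare words.

    Repeats until no word falls below min_count or max_rounds is reached.
    Hypothetical selection rule: any sentence containing at least one
    rare word is treated as a fragment worth repeating.
    """
    enriched = list(sentences)
    for _ in range(max_rounds):
        rare = find_rare_words(enriched, min_count)
        if not rare:
            break
        enriched.extend(s for s in sentences if rare & set(s.split()))
    return enriched


corpus = [
    "the cat sat",
    "the dog sat",
    "the cat ran",
    "the dog ran",
    "a zebra sat",
]
balanced = enrich_corpus(corpus, min_count=2)
print(len(corpus), len(balanced))
```

Repeating whole sentences also raises the counts of the frequent words they contain, so a real system would need some way to limit that distortion; the sketch ignores this.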
Similar resources
Cultural Influence on the Expression of Cathartic Conceptualization in English and Spanish: A Corpus-Based Analysis
This paper investigates the conceptualization of emotional release from a cognitive linguistics perspective (Cognitive Metaphor Theory). The metaphor weeping is a means of liberating contained emotions is grounded in universal embodied cognition and is reflected in linguistic expressions in English and Spanish. Lexicalization patterns which encapsulate this conceptualization i...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
An Exploration of Discoursal Construction of Identity in Academic Writing
The view that academic writing is purely objective, impersonal and informational, which is often reflected in English for Academic Purposes materials, has been criticized by a number of researchers. By now, the view of academic writing as embodying interaction among writers, readers and the academic community as a whole has been established. Following this assumption, the present study focused ...
Construction of spoken language model including fillers using filler prediction model
This paper proposes a novel method to construct a spoken language model including fillers from a corpus including no fillers using a filler prediction model. It consists of two submodels: a filler insertion model which predicts places where fillers should be inserted, and a filler selection model which predicts appropriate fillers for given places. It converts a corpus that covers domain-releva...
Dynamic Modeling and Construction of a New Two-Wheeled Mobile Manipulator: Self-balancing and Climbing
Designing the self-balancing two-wheeled mobile robots and reducing undesired vibrations are of great importance. For this purpose, the majority of researches are focused on application of relatively complex control approaches without improving the robot structure. Therefore, in this paper we introduce a new two-wheeled mobile robot which, despite its relative simple structure, fulfills the req...
Publication date: 2003